    Parts and Wholes in Scene Processing

    During natural vision, our brains are constantly exposed to complex, but regularly structured, environments. Real-world scenes are defined by typical part–whole relationships, where the meaning of the whole scene emerges from configurations of localized information present in its individual parts. Such typical part–whole relationships suggest that information from individual scene parts is not processed independently, but that there are mutual influences between the parts and the whole during scene analysis. Here, we review recent research that used a straightforward but effective approach to study such mutual influences: by dissecting scenes into multiple arbitrary pieces, these studies provide new insights into how the processing of whole scenes is shaped by their constituent parts and, conversely, how the processing of individual parts is determined by their role within the whole scene. We highlight three facets of this research. First, we discuss studies demonstrating that the spatial configuration of multiple scene parts has a profound impact on the neural processing of the whole scene. Second, we review work showing that cortical responses to individual scene parts are shaped by the context in which these parts typically appear within the environment. Third, we discuss studies demonstrating that missing scene parts are interpolated from the surrounding scene context. Bridging these findings, we argue that efficient scene processing relies on an active use of the scene's part–whole structure, where the visual brain matches scene inputs with internal models of what the world should look like.
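
    As a concrete illustration of the dissection approach described above, the following Python sketch cuts a scene image into a grid of parts and recombines them in either the original or a shuffled spatial configuration. The grid size, array shapes, and function names are illustrative assumptions, not the studies' actual stimulus code.

    ```python
    import numpy as np

    def dissect(scene, n_rows=2, n_cols=2):
        """Split an (H, W, C) scene image into n_rows * n_cols parts (row-major)."""
        rows = np.array_split(scene, n_rows, axis=0)
        return [part for row in rows for part in np.array_split(row, n_cols, axis=1)]

    def reassemble(parts, order, n_rows=2, n_cols=2):
        """Recombine parts in a given order: identity = intact, permuted = jumbled."""
        grid = [parts[i] for i in order]
        rows = [np.concatenate(grid[r * n_cols:(r + 1) * n_cols], axis=1)
                for r in range(n_rows)]
        return np.concatenate(rows, axis=0)

    rng = np.random.default_rng(0)
    scene = rng.random((256, 256, 3))                # stand-in for a real scene photograph
    parts = dissect(scene)
    intact = reassemble(parts, [0, 1, 2, 3])         # typical part-whole configuration
    jumbled = reassemble(parts, list(rng.permutation(4)))  # disrupted configuration
    ```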

    Review: Object vision in a structured world

    In natural vision, objects appear at typical locations, both with respect to visual space (e.g., an airplane in the upper part of a scene) and to other objects (e.g., a lamp above a table). Recent studies have shown that object vision is strongly adapted to such positional regularities. In this review we synthesize these developments, highlighting that adaptations to positional regularities facilitate object detection and recognition, and sharpen the representations of objects in visual cortex. These effects are pervasive across various types of high-level content. We posit that adaptations to real-world structure collectively support the optimal use of limited cortical processing resources. Taking positional regularities into account will thus be essential for understanding efficient object vision in the real world.

    Visual Imagery and Perception Share Neural Representations in the Alpha Frequency Band

    To behave adaptively with sufficient flexibility, biological organisms must cognize beyond immediate reaction to a physically present stimulus. For this, humans use visual mental imagery [1, 2], the ability to conjure up a vivid internal experience from memory that stands in for the percept of the stimulus. Visually imagined contents subjectively mimic perceived contents, suggesting that imagery and perception share common neural mechanisms. Using multivariate pattern analysis on human electroencephalography (EEG) data, we compared the oscillatory time courses of mental imagery and perception of objects. We found that representations shared between imagery and perception emerged specifically in the alpha frequency band. These representations were present in posterior, but not anterior, electrodes, suggesting an origin in parieto-occipital cortex. Comparison of the shared representations to computational models using representational similarity analysis revealed a relationship to later layers of deep neural networks trained on object representations, but not to auditory or semantic models, suggesting representations of complex visual features as the basis of this commonality. Together, our results identify and characterize alpha oscillations as a cortical signature of representations shared between visual mental imagery and perception.
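
    A minimal sketch of the band-specific cross-decoding logic, with simulated data in place of the recorded EEG: activity is band-pass filtered to the alpha range (8-12 Hz), a classifier is trained on perception trials, and it is tested on imagery trials. The array dimensions, sampling rate, and power-based features are assumptions for illustration.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt
    from sklearn.linear_model import LogisticRegression

    fs = 250                                         # assumed sampling rate (Hz)
    b, a = butter(4, [8 / (fs / 2), 12 / (fs / 2)], btype="band")

    rng = np.random.default_rng(0)
    # trials x channels x time; labels code the object shown or imagined
    X_percept = rng.standard_normal((100, 64, 250))
    X_imagery = rng.standard_normal((100, 64, 250))
    y_percept = rng.integers(0, 2, 100)
    y_imagery = rng.integers(0, 2, 100)

    # Restrict the signal to the alpha band and summarize it as per-channel power
    feat_p = (filtfilt(b, a, X_percept, axis=-1) ** 2).mean(axis=-1)
    feat_i = (filtfilt(b, a, X_imagery, axis=-1) ** 2).mean(axis=-1)

    # Train on perception, test on imagery: above-chance accuracy would indicate
    # representations shared between the two processes within the alpha band
    clf = LogisticRegression(max_iter=1000).fit(feat_p, y_percept)
    print("perception -> imagery decoding accuracy:", clf.score(feat_i, y_imagery))
    ```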

    Independent spatiotemporal effects of spatial attention and background clutter on human object location representations

    Spatial attention helps us to efficiently localize objects in cluttered environments. However, the processing stage at which spatial attention modulates object location representations remains unclear. Here we investigated this question by identifying the relevant processing stages in time and space, using an EEG and an fMRI experiment, respectively. As both object location representations and attentional effects have been shown to depend on the background on which objects appear, we included object background as an experimental factor. During the experiments, human participants viewed images of objects appearing in different locations on blank or cluttered backgrounds while performing a task either at fixation or in the periphery, directing their covert spatial attention away from or towards the objects. We used multivariate classification to assess object location information. Consistently across the EEG and fMRI experiments, we show that spatial attention modulated location representations during late processing stages (>150 ms; in middle and high ventral visual stream areas), independent of background condition. Our results clarify the processing stage at which attention modulates object location representations in the ventral visual stream and show that attentional modulation is a cognitive process separate from the recurrent processes related to processing objects on cluttered backgrounds.
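
    A minimal sketch of the time-resolved location decoding described above, with simulated stand-ins for the EEG data: a classifier is trained at each time point, separately for attended and unattended conditions, and the difference between the resulting accuracy time courses indexes attentional modulation. Trial counts, channel counts, and the time axis are assumptions.

    ```python
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_trials, n_channels, n_times = 200, 64, 120
    y_location = rng.integers(0, 4, n_trials)        # four possible object locations

    def decode_over_time(X, y):
        """Cross-validated location decoding accuracy at each time point."""
        return np.array([cross_val_score(LinearSVC(), X[:, :, t], y, cv=5).mean()
                         for t in range(X.shape[-1])])

    acc_attended = decode_over_time(
        rng.standard_normal((n_trials, n_channels, n_times)), y_location)
    acc_unattended = decode_over_time(
        rng.standard_normal((n_trials, n_channels, n_times)), y_location)

    # With a real time axis, indices past ~150 ms would isolate the late stages
    # at which the attentional modulation reported above emerges
    print("mean attentional modulation:", (acc_attended - acc_unattended).mean())
    ```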

    Ultra-Rapid serial visual presentation reveals dynamics of feedforward and feedback processes in the ventral visual pathway

    Human visual recognition activates a dense network of overlapping feedforward and recurrent neuronal processes, making it hard to disentangle processing in the feedforward direction from processing in the feedback direction. Here, we used ultra-rapid serial visual presentation to suppress the sustained activity that blurs the boundaries between processing steps, enabling us to resolve two distinct stages of processing with MEG multivariate pattern classification. The first processing stage was the rapid activation cascade of the bottom-up sweep, which terminated earlier as visual stimuli were presented at progressively faster rates. The second stage was the emergence of categorical information, whose peak latency shifted later in time with progressively faster stimulus presentations, indexing time-consuming recurrent processing. Using MEG-fMRI fusion with representational similarity analysis, we localized recurrent signals to early visual cortex. Together, our findings segregated an initial bottom-up sweep from subsequent feedback processing and revealed the neural signature of increased recurrent processing demands under challenging viewing conditions.
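
    A minimal sketch of MEG-fMRI fusion with representational similarity analysis, using simulated data: the representational dissimilarity matrix (RDM) of one fMRI region is correlated with MEG RDMs computed at each time point, so that peaks in the resulting time course indicate when that region's representational geometry emerges. Condition, sensor, and voxel counts are assumptions.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    n_conditions, n_sensors, n_times, n_voxels = 20, 200, 100, 500

    meg = rng.standard_normal((n_conditions, n_sensors, n_times))
    fmri = rng.standard_normal((n_conditions, n_voxels))   # e.g., early visual cortex

    rdm_fmri = pdist(fmri, metric="correlation")           # vectorized condition x condition RDM

    fusion = np.array([
        spearmanr(pdist(meg[:, :, t], metric="correlation"), rdm_fmri)[0]
        for t in range(n_times)
    ])

    # A late peak of this fusion time course in early visual cortex would be
    # diagnostic of recurrent, rather than purely feedforward, signals
    print("peak fusion time index:", fusion.argmax())
    ```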

    Typical visual-field locations facilitate access to awareness for everyday objects

    In real-world vision, humans are constantly confronted with complex environments that contain a multitude of objects. These environments are spatially structured, so that objects have different likelihoods of appearing in specific parts of visual space. Our massive experience with such positional regularities prompts the hypothesis that the processing of individual objects varies in efficiency across the visual field: when objects are encountered in their typical locations (e.g., we are used to seeing lamps in the upper visual field and carpets in the lower visual field), they should be perceived more efficiently than when they are encountered in atypical locations (e.g., a lamp in the lower visual field and a carpet in the upper visual field). Here, we provide evidence for this hypothesis by showing that typical positioning facilitates an object's access to awareness. In two continuous flash suppression experiments, objects overcame inter-ocular suppression more efficiently when they were presented in visual-field locations that matched their typical locations in the environment than when they appeared in non-typical locations. This finding suggests that, through extensive experience, the visual system has adapted to the statistics of the environment. This adaptation may be particularly useful for rapid object individuation in natural scenes.
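
    As a sketch of how such a facilitation effect can be quantified, the code below compares per-subject breakthrough times between typical and atypical locations with a paired t-test. The subject count and the simulated reaction times are placeholders, not the reported data.

    ```python
    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(0)
    n_subjects = 24                                  # assumed sample size

    # Mean time (s) for objects to overcome suppression, per subject and condition;
    # faster breakthrough in typical locations indicates facilitated access to awareness
    rt_typical = rng.normal(2.0, 0.3, n_subjects)
    rt_atypical = rng.normal(2.2, 0.3, n_subjects)

    t, p = ttest_rel(rt_typical, rt_atypical)
    print(f"typical vs atypical breakthrough: t = {t:.2f}, p = {p:.4f}")
    ```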

    The Spatiotemporal Neural Dynamics of Object Recognition for Natural Images and Line Drawings

    Drawings offer a simple and efficient way to communicate meaning. While line drawings capture only coarsely how objects look in reality, we still perceive them as resembling real-world objects. Previous work has shown that this perceived similarity is mirrored by shared neural representations for drawings and natural images, suggesting that similar mechanisms underlie the recognition of both. However, other work has proposed that representations of drawings and natural images become similar only after substantial processing has taken place, suggesting distinct mechanisms. To arbitrate between these alternatives, we measured brain responses resolved in space and time using fMRI and MEG, respectively, while human participants (female and male) viewed images of objects depicted as photographs, line drawings, or sketch-like drawings. Using multivariate decoding, we demonstrate that object category information emerged similarly fast and across overlapping regions in occipital, ventral-temporal, and posterior parietal cortex for all types of depiction, yet with smaller effects at higher levels of visual abstraction. In addition, cross-decoding between depiction types revealed strong generalization of object category information from early processing stages onward. Finally, by combining the fMRI and MEG data using representational similarity analysis, we found that visual information traversed similar processing stages for all types of depiction, yet with an overall stronger representation for photographs. Together, our results demonstrate broad commonalities in the neural dynamics of object recognition across types of depiction, providing clear evidence for shared neural mechanisms underlying the recognition of natural object images and abstract drawings.
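
    A minimal sketch of the cross-decoding logic, with simulated response patterns standing in for the measured data: a category classifier trained on responses to photographs is tested on responses to line drawings, and above-chance generalization indicates a shared category code. Trial and feature counts are assumptions.

    ```python
    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    n_trials, n_features = 200, 300                  # e.g., voxels in one visual region

    X_photo = rng.standard_normal((n_trials, n_features))
    X_drawing = rng.standard_normal((n_trials, n_features))
    y_photo = rng.integers(0, 2, n_trials)           # binary object category labels
    y_drawing = rng.integers(0, 2, n_trials)

    # Train on one depiction type, test on the other
    clf = LinearSVC().fit(X_photo, y_photo)
    print("photo -> drawing generalization accuracy:", clf.score(X_drawing, y_drawing))
    ```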

    Cortical sensitivity to natural scene structure

    Natural scenes are inherently structured, with meaningful objects appearing in predictable locations. Human vision is tuned to this structure: when scene structure is purposefully jumbled, perception is strongly impaired. Here, we tested how such perceptual effects are reflected in neural sensitivity to scene structure. During separate fMRI and EEG experiments, participants passively viewed scenes whose spatial structure (i.e., the position of scene parts) and categorical structure (i.e., the content of scene parts) could be intact or jumbled. Using multivariate decoding, we show that spatial (but not categorical) scene structure profoundly impacts cortical processing: scene-selective responses in occipital and parahippocampal cortices (fMRI) and responses after 255 ms (EEG) accurately differentiated between spatially intact and jumbled scenes. Importantly, this differentiation was more pronounced for upright than for inverted scenes, indicating genuine sensitivity to spatial structure rather than to low-level attributes. Our findings suggest that visual scene analysis is tightly linked to the spatial structure of our natural environments. This link between cortical processing and scene structure may be crucial for the rapid parsing of naturalistic visual inputs.
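
    The key contrast can be sketched as follows, with simulated patterns in place of the fMRI/EEG responses: intact versus jumbled scenes are decoded separately for upright and inverted stimuli, and a larger decoding advantage for upright scenes points to genuine sensitivity to spatial structure. All sizes are assumptions.

    ```python
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)

    def intact_vs_jumbled_accuracy(X_intact, X_jumbled):
        """Cross-validated accuracy for discriminating intact from jumbled scenes."""
        X = np.concatenate([X_intact, X_jumbled])
        y = np.repeat([0, 1], len(X_intact))
        return cross_val_score(LinearSVC(), X, y, cv=5).mean()

    n_trials, n_features = 100, 200
    acc_upright = intact_vs_jumbled_accuracy(
        rng.standard_normal((n_trials, n_features)),
        rng.standard_normal((n_trials, n_features)))
    acc_inverted = intact_vs_jumbled_accuracy(
        rng.standard_normal((n_trials, n_features)),
        rng.standard_normal((n_trials, n_features)))

    # Sensitivity to spatial structure beyond low-level attributes predicts
    # acc_upright > acc_inverted
    print("upright - inverted:", acc_upright - acc_inverted)
    ```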

    A neural mechanism for contextualizing fragmented inputs during naturalistic vision

    With every glimpse of our eyes, we sample only a small and incomplete fragment of the visual world, which needs to be contextualized and integrated into a coherent scene representation. Here we show that the visual system achieves this contextualization by exploiting spatial schemata, that is, our knowledge about the composition of natural scenes. We measured fMRI and EEG responses to incomplete scene fragments and used representational similarity analysis to reconstruct their cortical representations in space and time. We observed a sorting of representations according to the fragments' place within the scene schema, which occurred during perceptual analysis in the occipital place area and within the first 200 ms of vision. This schema-based coding operates flexibly across visual features (as measured by a deep neural network model) and across different types of environments (indoor and outdoor scenes). This flexibility highlights the mechanism's ability to efficiently organize incoming information under dynamic real-world conditions.
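
    A minimal sketch of schema-based representational similarity analysis, using simulated data: neural dissimilarities between scene fragments are correlated with a schema model that, for illustration, codes only each fragment's vertical position within the scene. Fragment and voxel counts, and the one-dimensional schema model, are assumptions.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    n_fragments, n_voxels = 12, 400

    neural = rng.standard_normal((n_fragments, n_voxels))       # e.g., occipital place area patterns
    vertical_pos = np.linspace(0.0, 1.0, n_fragments)[:, None]  # fragment position in the scene

    rdm_neural = pdist(neural, metric="correlation")
    rdm_schema = pdist(vertical_pos, metric="euclidean")        # schema model: positional distance

    # A positive correlation indicates that representations are sorted by the
    # fragments' place within the scene schema
    print("schema-model fit (Spearman rho):", spearmanr(rdm_neural, rdm_schema)[0])
    ```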

    Recurrence is required to capture the representational dynamics of the human visual system

    The human visual system is an intricate network of brain regions that enables us to recognize the world around us. Despite the system's abundant lateral and feedback connections, object processing is commonly viewed and studied as a feedforward process. Here, we measure and model the rapid representational dynamics across multiple stages of the human ventral stream using time-resolved brain imaging and deep learning. We observe substantial representational transformations during the first 300 ms of processing within and across ventral-stream regions. Categorical divisions emerge in sequence, cascading forward and in reverse across regions, and Granger causality analysis suggests bidirectional information flow between regions. Finally, recurrent deep neural network models clearly outperform parameter-matched feedforward models in their ability to capture the multiregion cortical dynamics. Targeted virtual cooling experiments on the recurrent models further substantiate the importance of their lateral and top-down connections. These results establish that recurrent models are required to understand information processing in the human ventral stream.
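
    A minimal sketch of the Granger causality logic, using simulated decoding time courses rather than the measured ones: each direction tests whether one region's signal improves prediction of the other's, and significant effects in both directions indicate bidirectional information flow. The region names, lag, and coupling are assumptions.

    ```python
    import numpy as np
    from statsmodels.tsa.stattools import grangercausalitytests

    rng = np.random.default_rng(0)
    n_times = 300
    v1 = rng.standard_normal(n_times)                         # e.g., a V1 decoding time course
    it = np.roll(v1, 5) + 0.5 * rng.standard_normal(n_times)  # IT as a lagged, noisy copy

    # grangercausalitytests checks whether the SECOND column predicts the first
    forward = grangercausalitytests(np.column_stack([it, v1]), maxlag=10, verbose=False)
    backward = grangercausalitytests(np.column_stack([v1, it]), maxlag=10, verbose=False)

    print("V1 -> IT p-value (lag 10):", forward[10][0]["ssr_ftest"][1])
    print("IT -> V1 p-value (lag 10):", backward[10][0]["ssr_ftest"][1])
    ```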